representer theorem

In statistical learning theory, a representer theorem is any of several related results stating that a minimizer f^{*} of a regularized empirical risk functional defined over a reproducing kernel Hilbert space can be represented as a finite linear combination of kernel functions evaluated at the input points of the training set.
==Formal Statement==
The following Representer Theorem and its proof are due to Schölkopf, Herbrich, and Smola:
Theorem: Let \mathcal{X} be a nonempty set and k a positive-definite real-valued kernel on \mathcal{X} \times \mathcal{X} with corresponding reproducing kernel Hilbert space H_k. Given a training sample (x_1, y_1), \dotsc, (x_n, y_n) \in \mathcal{X} \times \R, a strictly monotonically increasing real-valued function g \colon [0, \infty) \to \R, and an arbitrary empirical risk function E \colon (\mathcal{X} \times \R^2)^n \to \R \cup \lbrace \infty \rbrace, then for any f^{*} \in H_k satisfying
:
f^{*} = \operatorname{argmin}_{f \in H_k} \left\lbrace E\left( (x_1, y_1, f(x_1)), \dotsc, (x_n, y_n, f(x_n)) \right) + g\left( \lVert f \rVert \right) \right\rbrace, \quad (*)

f^{*} admits a representation of the form:
:
f^{*}(\cdot) = \sum_{i=1}^n \alpha_i k(\cdot, x_i),

where \alpha_i \in \R for all 1 \le i \le n.
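
As an illustration, consider the standard special case of kernel ridge regression (not part of the statement above): take E to be the squared loss \sum_{i=1}^n (y_i - f(x_i))^2 and g(\lVert f \rVert) = \lambda \lVert f \rVert^2 with \lambda > 0, which is strictly increasing on [0, \infty) as required. Substituting the expansion f(\cdot) = \sum_{i=1}^n \alpha_i k(\cdot, x_i) guaranteed by the theorem reduces the minimization over the infinite-dimensional space H_k to a finite-dimensional problem in \alpha = (\alpha_1, \dotsc, \alpha_n)^\top, solved by
:
\alpha = (K + \lambda I)^{-1} y, \qquad K_{ij} = k(x_i, x_j), \quad y = (y_1, \dotsc, y_n)^\top.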
Proof:
Define a mapping
:
\varphi \colon \mathcal{X} \to \R^{\mathcal{X}}, \quad \varphi(x) = k(\cdot, x)

(so that \varphi(x) = k(\cdot, x) is itself a map \mathcal{X} \to \R). Since k is a reproducing kernel,
:
\varphi(x)(x') = k(x', x) = \langle \varphi(x'), \varphi(x) \rangle,

where \langle \cdot, \cdot \rangle is the inner product on H_k.
Given any x_1, \dotsc, x_n, one can use orthogonal projection to decompose any f \in H_k into a sum of two functions, one lying in \operatorname{span} \left\lbrace \varphi(x_1), \dotsc, \varphi(x_n) \right\rbrace, and the other lying in the orthogonal complement:
:
f = \sum_{i=1}^n \alpha_i \varphi(x_i) + v,

where \langle v, \varphi(x_i) \rangle = 0 for all i.
The above orthogonal decomposition and the reproducing property together show that applying f to any training point x_j produces
:
f(x_j) = \left\langle \sum_{i=1}^n \alpha_i \varphi(x_i) + v, \varphi(x_j) \right\rangle = \sum_{i=1}^n \alpha_i \left\langle \varphi(x_i), \varphi(x_j) \right\rangle,

which we observe is independent of v. Consequently, the value of the empirical risk E in (*) is likewise independent of v. For the second term (the regularization term), note that v is orthogonal to \sum_{i=1}^n \alpha_i \varphi(x_i), so the Pythagorean theorem in H_k together with the strict monotonicity of g gives
:
\begin{align}
g\left( \lVert f \rVert \right) &= g\left( \biggl\lVert \sum_{i=1}^n \alpha_i \varphi(x_i) + v \biggr\rVert \right) \\
&= g\left( \sqrt{ \biggl\lVert \sum_{i=1}^n \alpha_i \varphi(x_i) \biggr\rVert^2 + \lVert v \rVert^2 } \right) \\
&\ge g\left( \biggl\lVert \sum_{i=1}^n \alpha_i \varphi(x_i) \biggr\rVert \right).
\end{align}

Therefore setting v = 0 does not affect the first term of (*), while it strictly decreases the second term whenever v \ne 0. Consequently, any minimizer f^{*} of (*) must have v = 0, i.e., it must be of the form
:
f^{*}(\cdot) = \sum_{i=1}^n \alpha_i \varphi(x_i) = \sum_{i=1}^n \alpha_i k(\cdot, x_i),

which is the desired result.
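
To make the consequence of this result concrete, the following is a minimal numerical sketch in Python of the kernel ridge regression special case described after the theorem statement (squared loss, g(\lVert f \rVert) = \lambda \lVert f \rVert^2, and a Gaussian kernel); the data, kernel choice, and parameter values here are illustrative assumptions, not part of the theorem itself.

import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian (RBF) kernel k(s, t) = exp(-gamma * (s - t)^2),
    # a positive-definite kernel on R x R (an illustrative choice).
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=30)               # training inputs x_1, ..., x_n
y = np.sin(x) + 0.1 * rng.standard_normal(30)     # noisy targets y_1, ..., y_n

lam = 1e-2                                        # regularization weight lambda
K = rbf_kernel(x, x)                              # Gram matrix K_ij = k(x_i, x_j)

# For this E and g, substituting f = sum_i alpha_i k(., x_i) into (*)
# reduces the problem to the linear system (K + lam * I) alpha = y.
alpha = np.linalg.solve(K + lam * np.eye(len(x)), y)

# The minimizer is the finite kernel expansion guaranteed by the theorem,
# f*(t) = sum_i alpha_i k(t, x_i), evaluated here on a test grid.
t = np.linspace(-3.0, 3.0, 200)
f_star = rbf_kernel(t, x) @ alpha

print("max training residual |f*(x_j) - y_j|:", np.max(np.abs(K @ alpha - y)))

Note that the optimization never leaves the n-dimensional span of the kernel sections k(\cdot, x_i): only the n coefficients \alpha_i are computed, which is precisely what makes kernel methods over infinite-dimensional H_k computationally tractable.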
